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Abstract 

We suggest an algorithm for determining the proper delay time and the 
minimum embedding dimension for Takens' delay-time embedding procedure. 
This method resorts to the rate of change of the spatial distribution of points 
on a reconstructed attractor with respect to the delay time, and can be suc- 
cessfully applied to a noisy time series which is too noisy to be discriminated 
from a structureless noisy time series by means of the correlation integral, 
and also indicates that the proper delay time depends on the embedding di- 
mension. 
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Since the discovery of the method to reconstruct state spaces from a time series 
various attempts have been implemented to choose the embedding parameters, the proper 
delay time and the minimum embedding dimension p|-|T0f. By the theorem of Takens [0, 



the reconstructions are generically diffeomorphic to the original dynamics if d > 2d a where 
is the dimension of the compact invariant smooth manifold A corresponding to the 
continuous time dynamical system and d is the embedding dimension. From the theorem, 
one can not, however, determine the delay time and the minimum embedding dimension. 
Nevertheless because the quality of the reconstructed attractor affects strongly the success 
of noise reduction processes JLTJ] and forecasting methods as well as accurate calculations of 
various invariant quantities such as Lyapunov exponents, it is essential to choose the proper 
embedding parameters before carrying out the above mentioned jobs. 

In computational analysis with a noisy time series, because what the proper delay time 
is depends on the purpose for reconstructing attractors, it is suggested that the delay time 
is much less critical to success of the reconstruction process than the embedding dimension. 

In general, with the chosen delay time too small, the reconstructed points coincide ap- 
proximately within the range of errors so that the reconstructed attractor appears "stretched 
out and randomly crowded" along the identity line in the time-delayed coordinates. If the 
delay time is too large, operations such as Grassberger-Procaccia algorithm [O for the cor- 



relation dimension can be lead to wrong results [13] due to the nonlinearity of the embedding 
process not of the original dynamics. 

On the other hand, working in any larger dimension than the minimum leads to a more 
computational burden in obtaining the invariant quantities such as Lyapunov exponents, 
prediction, etc. And it also enhances the problem of contamination by round-off or mea- 
surement errors since the noises will populate and dominate the additional dimensions of 
the embedding space. The most important reason for looking for the minimum embedding 
dimension is to obtain an intuition for understanding and modeling the underlying dynamics 
of a noisy time series. Also in many schemes for noise reduction, the first job to do is to 



have at least a vague idea about the minimum embedding dimension [|TT . 

In this paper, we propose a method to determine simultaneously the minimum embedding 
dimension du and the proper delay time for a noisy time series. The suggested method 
can be applied successfully to a noisy time series contaminated by a measurement noise of 
which the size is so much large that one can not disentangle the deterministic part from the 
random part by means of the correlation integral ||14j| . Also on the basis of this method, 
we find that the proper window length W d = (d - l)T d d remains approximately constant if 
d > d M . 

Let a continuous signal w(t) be measured at each "sampling interval" T s to yield a time 
series {v{k)}\ 

{v(k)} = {v(k)\v(k)=w(kT s ), fc=l,2,...,iV T }, (1) 

where Nt is the total number of time series. We then assign to x(t) the delay vector: 

y(p\d, T d ) = (v(l + (p - l)j),v(l + (p - l)j + T d ), ■ ■ ■ , v(l + (p - l)j + (d- l)T d )) (2) 

p = l,2,...,JV 

In y(p\d, To), d and T d represent the embedding dimension and the delay time in the unit 
of the sampling time T s , respectively, j is the interval in the unit of sampling time T s 
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between the first components of successive delay vectors, and describes how the time series 
is sampled to create a set of delay vectors with a computationally adequate size from a 
given time series. The time (d — l)T d , spanned by each delay vector, is called the "window 
length" of the embedding |5|Jl9|]. In general, the coordinates of the delay vector made in 
this way are nonlinearly represented by the original dynamical variables. This is one of the 
reasons that the singular value decomposition is not appropriate as a method to choose the 
proper embedding dimension JT^JT^] . Also once the Takens' criterion d > 2d a is satisfied 
and the delay time is not too small or too large, it is difficult to say which delay time is the 
most proper. Whether the delay time is the most adequate or not depends on the operation 
applied to the reconstructed attractor. 

In this paper, we propose as the criterion for the proper value of the delay time T d the first 
minimum of the rate of change of the distribution of points on a reconstructed attractor as 
a function of the delay time T d . The increase of the delay time T d from enough small a value 
let the reconstructed points go away from the identity line in the reconstructed coordinates, 
that is, the reconstructed attractor expands spatially in the reconstructed coordinates. Due 
to the boundness of attractors, the rate of expansion monotonically decreases and the folding 
effect becomes larger as the delay time T d increases. At last, the folding effect is equal to the 
expanding one. At this point, the rate of variation of the spatial distribution of attractor 
becomes the first minimum value T$. This criterion we suggest is almost equivalent to the 
maximal expansion condition of the reconstructed attractor. 

To infer the distribution of points of a attractor, we introduce D(N,r,d,Td) which is 
defined as the fraction of pairs of delay vectors between which the distance lies from r to 
r + Ar: 

9 N-l i-1 

D(N,r,d,T d ) = £ £F(r), (3) 



i=0 j=Q 



where 

H(r) = { 1 if r < y/\y(i\d, T d ) - y(j\d, T d )\*/d < r + Ar (4) 
( otherwise 

with Ar small enough. In Eq.(^), | ■ | 2 means the square of Euclidean distance between 
y{i\d,T d ) and y(j\d,T d ) and is devided by the embedding dimension d in order to eliminate 
the effect that the size of a reconstructed attractor becomes larger in a higher embedding 
dimension d. Fixing the embedding dimension d, at each delay time T d we calculate Eq.@ 
and the sum of diffenences between the adjacent delay times, S(N, d,W = (d — l)T d ): 

S(N, d,W=(d- l)T d ) = / \D(N, r, d, T d + AT d ) - D(N, r, d, T d )\dr 

Jo 

R A /Ar 

«Ar £ \D(N,kAr,d,T d + AT d )-D(N,kAr,d,T d )\ (5) 
k=i 



where Ra = y R 2 /d with R the largest length among the distances between any two points, 
and ATd and Ar are small enough. The first minimum point Tf of S(N,d,W = (d — 1)T^) 
corresponds to the proper delay time in the embedding dimension d. 
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On the other hand, while reconstructed attractors with an embedding dimension d less 
than the minimum embedding dimension du have geometric overlaps, attractors recon- 
structed under the condition of d > du do not have any overlap, and are diffeomorphic to 
each other. These facts mean that S(N, d > du-, W)s' show a similar behaviour with respect 
to the window length W — (d— l)Td, at least until the first minimum point W — (d— l)T d 
of S(N, d, W). After the first minimum point W d = (d — 1)T$ is passed through, the fold- 
ing effect of the embedding procedure dominates the expanding effect. The folding effect 
raises an unexpected result in the calculation of D(N,r,d,Td) because in the range from r 
to r + Ar, points even from a folded part are counted. This is just the reason that if the 
window length is too large, Grassberger-Procaccia algorithm for the correlation dimension 
can be lead to wrong results [13|]. So we can choose simultaneously the proper embedding 
parameters through the calculation of S(N, d, W)s\ Fig.0 illustrates all the above mensioned 
phenomena. In the Fig.[l], we consider the Hyperchaos system of Rossler [ 20| : 



x = —y — z, y = x + 0.25?/ + w, 

z = 3 + xz, w = —0.5,2 + 0.05k;, 

with the sampling time T s — 0.1 and the initial condition (—20, 0, 0, 15). We acquire 50000 
points with v(k) = x{kT s ) and then for calculating D(N, r, d, T d ), 5000 points are picked up 
from the data set, satisfied by 

= [N T - (d - l)T d - 1} 

N — 1 1 ' 

in Eq. (fj). It is important to evenly pick up the data points because the full time series has 
to be reflected. As the algorithm to integrate the differential equations, we adopt the fourth 
order Runge-Kutta method KTJ . 

In this paper, we want to place emphasis on the robustness of this method against 
measurement noise. A time series V(k) contaminated by a measurement noise is represented 
as follows: 



V(k) = v(k) + X(k), (7) 

where v(k) is a noise-free time series and X(k) a measurement noise. The effect of measure- 
ment noise X(k) in Eq.(^) in the reconstruction procedure results only in the thickness of 
the reconstructed trajectory. However, in the variation with respect to the delay time T d 
(Eq.(||)), the thickness due to noise is canceled out because it has nothing to do with the de- 
lay time Td- To show the robustness against noise, we make a time series V(k) = x(k) + X(k), 
where x(k) is just the time series for Fig.^Jand X(k) is a gaussian white noise with the mean 
'0' and the variance '25'. This time series V(k) is too noisy to be discriminated from a 
structureless noisy time series by means of the correlation integral (Fig.^). The result is 
shown in Fig.|3]. One should notice that the trend of variation is totally similar to the one 
of the measurement noise-free time series(Fig.|l]). 

In evaluating D(N, r, d, Td), it takes exponentially increasing time according to the num- 
ber iV of picked- up points. Therefore, if one wants to pay a lower cost for the D(N, r, d, Td), 
it is important to pick up points as few as possible, which depends closely on what values of 
ATd and Ar are set. Because the purpose of calculation of D(N, r, d, Td) is to infer globally 
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the spatial distribution of a reconstructed attractor, not to examine the fine structure, it is 
sufficient to set Ar « R/50, where R is the largest length among the distances between any 
two points in the reconstructed attractor. On the other hand, using the power spectrum of 
time series, the AT^ can be properly chosen. This is because in d = 2, the first minimum 
point T$ =2 is, on the average, related to the temporal fluctuation of time series. As a guide 
line for the choice of AT^, we advise 1/(4/^ AT^) ~ 10, where fh means the highest frequency 
in the power spectrum of the time series. The contribution of background noise has to not 
be considered. This condition corresponds to W d=2 ~ lOAT^. The sampling time T s has 
to be < ATrf. With these crude guidance, we chose the values for AT^ and Ar in Fig.[I] 
and Fig.[| As the last and the most important step, with the roughly chosen AT^ and Ar, 
one has to find the proper number N of picked-up points. This is easily done by finding 
when D(N,r,d = 2, To) is saturated. This step is shown in Fig.[| which indicates us that 
the proper N for Fig.|I| is about 3000, even though we use iV = 5000 in Fig.jl]. 
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FIG. 1. S(N, d, W) vs window length W for N = 5000, evenly picked up among 50000 points 
generated by the Hyperchaos system of Rdssler. For the sake of comparison, each S(N, d, W) was 
appropriately scaled. The thick arrow points the proper window length W d ss 60 (d > d M )- We can 
identify the true value of the minimum embedding dimension dj^ = 4. Therefore the proper delay 
time is W d=4 /3 = T$= A m 20. On the other hand, the proper window lengths in the embedding 
dimension d = 2 and d = 3 are W d=2 « 20 and W d=3 « 45, respectively. Ar = 2, AT d = T S = 0.1. 
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FIG. 2. Slope= G?[log 2 Cd(r)]/dlog 2 (r) vs log 2 (r) for 50000 points generated by the Hyperchaos 
system of Rossler and Slope vs log 2 (r) for 50000 points generated by the Hyperchaos system which 
has been contaminated with the gaussian white noise of the mean and the variance 25. Cd means 
the correlation integral with the embedding dimension d. While the plot of slope vs log 2 (r) for 
the noise-free data tells that the corrlation dimension is about 3, the plot for the noisy data is 
completely the one for a structureless random noisy data. The delay time Td is 30. 
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FIG. 3. S(N, d, W) vs window length W for N = 49500 among 50000 points generated by the 
Hyperchaos system of Rossler which has been contaminated with the gaussian white noise of the 
mean and the variance 25. For the sake of comparison, each S(N, d, W) was appropriately scaled. 
Despite of the presence of the measurement noise, everything is almost the same as Fig. |l|. We 
can say that d M = 4 and Tj =A k, 20. Ar = 2, AT d = T S = 0.1. 
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FIG. 4. S(N, d = 2, W) vs window length W for various values of N, evenly picked up among 
50000 points generated by the Hyperchaos system of Rossler with Ar = 2, AT^ = T s = 0.1. The 
curves for S(N, d = 2, W) are saturated at about N = 3000. 
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